High accuracy operon prediction method based on STRING database scores
Abstract
We present a simple and highly accurate computational method for operon prediction, based on intergenic distances and functional relationships between the protein products of contiguous genes, as defined by the STRING database (Jensen,L.J., Kuhn,M., Stark,M., Chaffron,S., Creevey,C., Muller,J., Doerks,T., Julien,P., Roth,A., Simonovic,M. et al. (2009) STRING 8-a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Res., 37, D412-D416). These two parameters were used to train a neural network on a subset of experimentally characterized Escherichia coli and Bacillus subtilis operons. Our predictive model was successfully tested on the set of experimentally defined operons in E. coli and B. subtilis, with accuracies of 94.6% and 93.3%, respectively. As far as we know, these are the highest accuracies ever obtained for predicting bacterial operons. Furthermore, to evaluate the predictive accuracy of our model when one organism's data set is used for training and a different organism's data set for testing, we repeated the E. coli operon prediction analysis using a neural network trained with B. subtilis data, and the B. subtilis analysis using a neural network trained with E. coli data. Even in these cases, the accuracies reached with our method were very high: 91.5% and 93%, respectively. These results show the potential of our method for accurately predicting the operons of any other organism. Our operon predictions for fully sequenced genomes are available at http://operons.ibt.unam.mx/OperonPredictor/.
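The idea of classifying adjacent gene pairs from two inputs can be illustrated with a minimal sketch. The abstract does not specify the network architecture, training procedure, or input scaling, so everything below is an assumption made for illustration: a one-hidden-layer network written in plain Python, intergenic distance in base pairs, a STRING score assumed to be rescaled to the range 0 to 1, and two small sets of made-up gene pairs (`ORGANISM_A_PAIRS`, `ORGANISM_B_PAIRS`) that stand in for two organisms. None of the numbers come from the paper or from STRING.

```python
import math
import random


def sigmoid(z):
    # Clamp the input to avoid overflow in math.exp for extreme values.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


class TinyPairClassifier:
    """Hypothetical 2-input network: tanh hidden layer, sigmoid output."""

    def __init__(self, n_hidden=3, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(2)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    @staticmethod
    def _features(distance_bp, string_score):
        # Assumed scaling: distance in hundreds of bp, score already in 0..1.
        return [distance_bp / 100.0, string_score]

    def _forward(self, x):
        hidden = [math.tanh(w[0] * x[0] + w[1] * x[1] + b)
                  for w, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def predict_proba(self, distance_bp, string_score):
        """Estimated probability that two adjacent genes share an operon."""
        return self._forward(self._features(distance_bp, string_score))[1]

    def fit(self, pairs, epochs=2000, lr=0.1):
        # Plain stochastic gradient descent on the cross-entropy loss.
        for _ in range(epochs):
            for distance_bp, string_score, label in pairs:
                x = self._features(distance_bp, string_score)
                hidden, out = self._forward(x)
                delta_out = out - label
                for j, h in enumerate(hidden):
                    delta_h = delta_out * self.w2[j] * (1.0 - h * h)
                    self.w2[j] -= lr * delta_out * h
                    self.w1[j][0] -= lr * delta_h * x[0]
                    self.w1[j][1] -= lr * delta_h * x[1]
                    self.b1[j] -= lr * delta_h
                self.b2 -= lr * delta_out


def accuracy(model, pairs, threshold=0.5):
    """Fraction of gene pairs whose thresholded prediction matches the label."""
    hits = sum((model.predict_proba(d, s) >= threshold) == bool(y)
               for d, s, y in pairs)
    return hits / len(pairs)


# Made-up (distance_bp, string_score, same_operon) triples, illustration only.
ORGANISM_A_PAIRS = [
    (-4, 0.95, 1), (10, 0.90, 1), (25, 0.80, 1),
    (60, 0.85, 1), (5, 0.70, 1), (40, 0.92, 1),
    (250, 0.10, 0), (180, 0.30, 0), (320, 0.05, 0),
    (150, 0.20, 0), (90, 0.15, 0), (400, 0.40, 0),
]
ORGANISM_B_PAIRS = [
    (0, 0.88, 1), (30, 0.75, 1), (15, 0.95, 1),
    (200, 0.12, 0), (280, 0.25, 0), (130, 0.08, 0),
]

# Train on one toy organism, then score the other (cross-organism setup).
model = TinyPairClassifier()
model.fit(ORGANISM_A_PAIRS)
print("accuracy on toy organism B:", accuracy(model, ORGANISM_B_PAIRS))
```

Training on one toy set and scoring the other mirrors the cross-organism evaluation described above in spirit only. A real analysis would use experimentally characterized operons and actual STRING scores for each contiguous gene pair.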
Similar resources
An Improved Genetic Algorithm for Operon Prediction
An operon is a fundamental unit of transcription and contains specific functional genes for the construction and regulation of networks at the whole genome level. The prediction of operons is critical to understanding gene regulation and function in newly sequenced genomes. As experimental methods for operon detection tend to be non-trivial and time-consuming, various methods have been used for...
Operon prediction in Pyrococcus furiosus
Identification of operons in the hyperthermophilic archaeon Pyrococcus furiosus represents an important step to understanding the regulatory mechanisms that enable the organism to adapt and thrive in extreme environments. We have predicted operons in P.furiosus by combining the results from three existing algorithms using a neural network (NN). These algorithms use intergenic distances, phyloge...
Prediction the risk of occupational accidents using ANFIS in AZARAB Company
Nowadays, none of the industries are willing to have accidents in their workplaces and use different tools in this regard. One of these tools, which is capable of identifying risks and inappropriate situations, is risk analysis. Due to the importance of job risk prediction and reduction of occupational injury in this study, job risk prediction using different neural network algorithms has been ...
Binary particle swarm optimization for operon prediction
An operon is a fundamental unit of transcription and contains specific functional genes for the construction and regulation of networks at the entire genome level. The correct prediction of operons is vital for understanding gene regulations and functions in newly sequenced genomes. As experimental methods for operon detection tend to be nontrivial and time consuming, various methods for operon...
Predicting the Operon Structure of Bacillus subtilis Using Operon Length, Intergene Distance, and Gene Expression Information
We predict the operon structure of the Bacillus subtilis genome using the average operon length, the distance between genes in base pairs, and the similarity in gene expression measured in time course and gene disruptant experiments. By expressing the operon prediction for each method as a Bayesian probability, we are able to combine the four prediction methods into a Bayesian classifier in a s...